Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix some path logic to work on Windows #2704

Merged
merged 29 commits into from
Jan 18, 2024
Merged

Conversation

jchadwick-buf
Copy link
Member

This does not fix everything, so Windows CI is still not passing. However, this should still be closer.

I think ExternalPath is supposed to be unnormalized, so the tests that test for it are fixed to use unnormalized paths.

osRootBucket, err := storageosProvider.NewReadWriteBucket(
string(os.PathSeparator),
inputDirPathComponents[0],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't right/isn't what the intended behavior is for unix, can comment later

Copy link
Member Author

@jchadwick-buf jchadwick-buf Jan 10, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sake of posterity this behavior is identical on UNIX to the previous behavior because Components is called on the absolute path and looks like this:

[]string{"/", "Users", "john", "Code", "buf", "private", "buf", "cmd", "buf", "testdata", "fail2"}

So the bucket on UNIX is always "/" but on Windows could be a volume name (drive letter, UNC path.)
I'm not really opposed to refactoring it anyways (this is a pretty ugly approach at the end of the day,) just worth noting.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think using normalpath.Components in this way is somewhat reasonable, since it's parsing an absolute path, and so on UNIX, the behaviour is very safe. However, as you noted, the Windows path could be either fully-qualified or UNC, and I think this creates some instability. We do handle the first component in normalpath to set the volume:

if len(volumeComponent) > 0 {
// On Windows the volume of an absolute path could be one of the following 3 forms
// c.f. https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file?redirectedfrom=MSDN#fully-qualified-vs-relative-paths
// * A disk designator: `C:\`
// * A UNC Path: `\\servername\share\`
// c.f. https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-dfsc/149a3039-98ce-491a-9268-2f5ddef08192
// * A "current volume absolute path" `\`
// This refers to the root of the current volume
//
// We do not support paths with string parsing disabled such as
// `\\?\path`
//
// If we did extract a volume name, we need to add a path separator to turn it into
// a path component. Volume Names without path separators have an implied "current directory"
// when performing a join operation, or using them as a path directly, which is not the
// intention of `Split` so we ensure they always mean "the root of this volume".
volumeComponent = volumeComponent + stringOSPathSeparator
}

I think the concern is actually when we're given a UNC path. So I think the better route to go is to explicitly parse the drive and pass that through as our path, instead of deriving it from the arbitrary path structure.

Copy link
Member

@bufdev bufdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Blocking as this isn't what the fix should be unless I'm missing something

@doriable doriable marked this pull request as draft January 10, 2024 23:12
@doriable
Copy link
Member

I put this in draft because I have no resolved all the issues yet, but would like to push up some changes. Will un-draft and request reviews once these issues are resolved.

// USERPROFILE shows the current user's profile folder. It is being used to parse the volume
// for the current user and this is used as the process's context.
// https://learn.microsoft.com/en-us/windows/deployment/usmt/usmt-recognized-environment-variables#variables-that-are-processed-for-the-operating-system-and-in-the-context-of-each-user
FSRoot = filepath.VolumeName(os.Getenv("USERPROFILE")) + string(os.PathSeparator)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I retract my comment earlier, I think it isn't feasible to parse this through the system env vars -- the process could just be running in an entirely separate drive, which is what is happening in the actions. We should use normalpath.Components when parsing the absolute paths and document clearly the behaviours of how reader.go is constructing the OS root buckets.

@@ -205,13 +206,15 @@ func newDecodeError(fileName string, err error) error {
fileName = "config file"
}
// We intercept PathErrors in buffetch to deal with fixing of paths.
return &fs.PathError{Op: "decode", Path: fileName, Err: err}
// We return a cleaned, unnormalized path in the error for clarity with user's filesystem.
return &fs.PathError{Op: "decode", Path: filepath.Clean(normalpath.Unnormalize(fileName)), Err: err}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fileName isn't necessarily a path I believe

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the case where it isn't a path (e.g. fileName = "config file" or some other case), normalpath.Unnormalize and filepath.Clean would be no-ops.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to make a decision as to whether paths should be unnormalized in fs.PathErrors - I've generally taken the position that they should not be in the rest of the codebase, I'm worried that we're special-casing here, any thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with either -- the unnormalized error was mostly for UX reasons (matching file paths, and we did show unnormalised file paths in the past, etc.) but as noted, they're not always a file path, and consistency is kind of the key. So I'm totally okay with keeping them as normalised paths if this is the pattern throughout the new libraries, in which case, I can adjust the tests to reflect that also.

Copy link
Member

@doriable doriable left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Added TODOs for all WIndows tests not fixed here:
    • Formatter diffs
    • b5 digest issues
    • Duplicate path separators, even after filepath.Clean
    • Carriage returns affecting output
    • Archive test concurrency

@doriable doriable marked this pull request as ready for review January 16, 2024 22:29
@jchadwick-buf jchadwick-buf mentioned this pull request Jan 17, 2024
@doriable
Copy link
Member

doriable commented Jan 17, 2024

Turns out a large portion of our issues were due to a .gitattributes change that did not allow git to convert line endings on checkout for testdata. In the past, we set **/testdata/** files to binary, which forces git to treat them as binary.
The change set -crlf instead, but this assumes that git does not check them out as text files.
So now, we are setting -text explicitly so that they are not being handled by text=auto setting, even if they are being checked out as text files. This fixes many of the issues caused by carriage returns.

There are still a couple of remaining TODO's to investigate:

  • duplicated path separators
  • archive test concurrency issues

Copy link
Member

@bufdev bufdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@doriable and I chatted and we're going to merge this in as-is, and @doriable is going to fix up the remaining tests on a new PR.

@bufdev bufdev merged commit 606fc67 into bufmod Jan 18, 2024
7 checks passed
@bufdev bufdev deleted the jchadwick/bufmod-windows-fixes branch January 18, 2024 17:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants